De Novo Assembly of Chickpea Transcriptome Using Short Reads for Gene Discovery and Marker Identification
نویسندگان
چکیده
Chickpea ranks third among the food legume crops production in the world. However, the genomic resources available for chickpea are still very limited. In the present study, the transcriptome of chickpea was sequenced with short reads on Illumina Genome Analyzer platform. We have assessed the effect of sequence quality, various assembly parameters and assembly programs on the final assembly output. We assembled ~107million high-quality trimmed reads using Velvet followed by Oases with optimal parameters into a non-redundant set of 53 409 transcripts (≥100 bp), representing about 28 Mb of unique transcriptome sequence. The average length of transcripts was 523 bp and N50 length of 900 bp with coverage of 25.7 rpkm (reads per kilobase per million). At the protein level, a total of 45 636 (85.5%) chickpea transcripts showed significant similarity with unigenes/predicted proteins from other legumes or sequenced plant genomes. Functional categorization revealed the conservation of genes involved in various biological processes in chickpea. In addition, we identified simple sequence repeat motifs in transcripts. The chickpea transcripts set generated here provides a resource for gene discovery and development of functional molecular markers. In addition, the strategy for de novo assembly of transcriptome data presented here will be helpful in other similar transcriptome studies.
منابع مشابه
Clustering of Short Read Sequences for de novo Transcriptome Assembly
Given the importance of transcriptome analysis in various biological studies and considering thevast amount of whole transcriptome sequencing data, it seems necessary to develop analgorithm to assemble transcriptome data. In this study we propose an algorithm fortranscriptome assembly in the absence of a reference genome. First, the contiguous sequencesare generated using de Bruijn graph with d...
متن کاملComprehensive Transcriptome Assembly of Chickpea (Cicer arietinum L.) Using Sanger and Next Generation Sequencing Platforms: Development and Applications
A comprehensive transcriptome assembly of chickpea has been developed using 134.95 million Illumina single-end reads, 7.12 million single-end FLX/454 reads and 139,214 Sanger expressed sequence tags (ESTs) from >17 genotypes. This hybrid transcriptome assembly, referred to as Cicer arietinumTranscriptome Assembly version 2 (CaTA v2, available at http://data.comparative-legumes.org/transcriptome...
متن کاملGenome Analysis Gene Discovery and Tissue-Specific Transcriptome Analysis in Chickpea with Massively Parallel Pyrosequencing and Web Resource Development
Chickpea (Cicer arietinum) is an important food legume crop but lags in the availability of genomic resources. In this study, we have generated about 2 million high-quality sequences of average length of 372 bp using pyrosequencing technology. The optimization of de novo assembly clearly indicated that hybrid assembly of long-read and short-read primary assemblies gave better results. The hybri...
متن کاملOptimization of De Novo Short Read Assembly of Seabuckthorn (Hippophae rhamnoides L.) Transcriptome
Seabuckthorn (Hippophaerhamnoides L.) is known for its medicinal, nutritional and environmental importance since ancient times. However, very limited efforts have been made to characterize the genome and transcriptome of this wonder plant. Here, we report the use of next generation massive parallel sequencing technology (Illumina platform) and de novo assembly to gain a comprehensive view of th...
متن کاملDe novo assembly, functional annotation, and marker development of Asian pear (Pyrus pyrifolia) fruit transcriptome through massively parallel sequencing.
This study investigated the Asian pear transcriptome using the RNA-Seq normalized fruit cDNA library to create a transcriptomic resource for unigene and marker discovery. Following the removal of lowquality reads, 127,085,054 trimmed reads were assembled de novo to yield 37,649 non-redundant unigenes with an average length of 599 bp. Alternative splicing events were detected in 4121 contigs. A ...
متن کامل